Data was sourced from Yahoo! Finance. We choose to focus on 27 stocks that have been in the Dow Jones Index for at least two years.
Create a dataframe of daily closing prices for each stock in the four most recent financial quarters.
# build a matrix of closing prices:
# 27 stocks (rows) by 252 trading days (columns)
companies.closings <- matrix(data = NA, nrow = length(companies),
                             ncol = length(dataMMM$MMM.Close))
for (i in seq_along(companies.df)) {
  # closing prices are in the 4th column
  companies.closings[i, ] <- as.numeric(companies.df[[i]][, 4])
}
# change the names of the rows
rownames(companies.closings) <- companies
# take the transpose
# each row is a trading day with 27 different stock prices
# each column is a stock
companies.closings.t <- t(companies.closings)
day <- seq_len(nrow(companies.closings.t))
df <- as.data.frame(cbind(day, companies.closings.t))
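The loop-and-transpose steps above can also be written in one pass with `sapply`. The following is only a minimal sketch on made-up data: `toy.companies`, `toy.df`, and `toy.closings` are hypothetical stand-ins, and it assumes, as in the code above, that each table keeps its closing prices in the 4th column.

```r
# Toy stand-in for companies.df: a list of per-stock tables whose
# 4th column holds closing prices (random numbers, not real quotes).
set.seed(1)
toy.companies <- c("AAA", "BBB", "CCC")
n.days <- 5
toy.df <- lapply(toy.companies, function(sym) {
  data.frame(Open  = runif(n.days, 90, 110),
             High  = runif(n.days, 110, 120),
             Low   = runif(n.days, 80, 90),
             Close = runif(n.days, 90, 110))
})

# sapply returns one column per stock, so the result is already
# days x stocks and needs no transpose.
toy.closings <- sapply(toy.df, function(x) as.numeric(x[, 4]))
colnames(toy.closings) <- toy.companies

# Prepend the day index, giving one row per trading day.
toy.data <- data.frame(day = seq_len(nrow(toy.closings)), toy.closings)
```

Because the day index occupies the first column, the price of the i-th stock sits in column `i + 1` of the resulting data frame.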
The plot below displays trends in the closing prices of all 27 stocks over the past four fiscal quarters (July 1st, 2019 to June 30th, 2020). Click once on a company name in the legend to hide its line. Double-click on a company to isolate it. Use the buttons at the top right of the plot to zoom, pan, navigate, and compare closing prices across companies.
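# first trace: MMM (column 2 of df; column 1 is the day index)
asset1 <- plot_ly(data = df, x = ~day, y = ~MMM, name = 'MMM',
                  type = 'scatter', mode = 'lines', line = list(color = 'rgb(1, 1, 1)'))
# remaining stocks: companies[i] is stored in column i + 1 of df
# plotly assigns each added trace a distinct default colour
for (i in 2:length(companies)) {
  asset1 <- asset1 %>% add_trace(y = df[, i + 1], name = companies[i])
}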
asset1 <- asset1 %>%
add_trace(x = 0, y = c(0, 400), name = '2019 - Q3',
line = list(color = 'rgb(100, 100, 100)', dash = 'dash')) %>%
add_trace(x = nrow(df)/4, y = c(0, 400), name = '2019 - Q4',
line = list(color = 'rgb(100, 100, 100)', dash = 'dash')) %>%
add_trace(x = 2*nrow(df)/4, y = c(0, 400), name = '2020 - Q1',
line = list(color = 'rgb(100, 100, 100)', dash = 'dash')) %>%
add_trace(x = 3*nrow(df)/4, y = c(0, 400), name = '2020 - Q2',
line = list(color = 'rgb(100, 100, 100)', dash = 'dash')) %>%
layout(title = 'Closing Stock Prices of Dow Jones (July 1st, 2019 to June 30th, 2020)',
xaxis = list(title = 'Day', zeroline = TRUE),
yaxis = list(title = 'Closing Price ($)'))
asset1
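The dashed quarter markers above sit at equal fractions of the row count (`nrow(df)/4`), which only approximates the true boundaries because calendar quarters do not all contain the same number of trading days. If the trading dates are available, the boundaries can be derived from them instead. The sketch below is a hedged illustration, not the method used above: `trade.dates` is a hypothetical vector built from weekdays only, so market holidays are ignored and the counts will differ from the real data.

```r
# Hypothetical trading calendar: weekdays from 2019-07-01 to 2020-06-30.
# Market holidays are NOT removed, so this is only an approximation.
all.days <- seq(as.Date("2019-07-01"), as.Date("2020-06-30"), by = "day")
is.weekend <- as.POSIXlt(all.days)$wday %in% c(0, 6)  # 0 = Sunday, 6 = Saturday
trade.dates <- all.days[!is.weekend]

# Label each date with its calendar year and quarter, e.g. "2019 - Q3".
quarter.label <- paste(format(trade.dates, "%Y"), quarters(trade.dates), sep = " - ")

# Index (trading-day number) of the first trading day in each quarter.
quarter.start <- which(!duplicated(quarter.label))
names(quarter.start) <- quarter.label[quarter.start]
quarter.start
```

Each value in `quarter.start` could then serve as the `x` position of the matching dashed line in place of a multiple of `nrow(df)/4`.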
Using principal component analysis (PCA), we reduce the data to two principal components, shown in the biplots below. In the second half of 2019 (top), when the US economy was functioning normally, the stocks tend not to correlate with one another: the vectors for the stocks radiate in all directions. In the first half of 2020, however, the stock prices of most, if not all, companies fell because of COVID-19. The biplot for 2020 (bottom) reflects this, since the vectors for the companies all point in the same general direction.
# half-year cutoff (integer division keeps the row index whole)
half <- nrow(companies.closings.t) %/% 2
pca_2019 <- prcomp(companies.closings.t[1:half, ], center = TRUE, scale. = TRUE)
pca_2020 <- prcomp(companies.closings.t[(half + 1):(2 * half), ], center = TRUE, scale. = TRUE)
par(mfrow=c(2,1))
biplot(pca_2019, main = "2019 - Q3 & Q4")
biplot(pca_2020, main = "2020 - Q1 & Q2")
par(mfrow=c(1,1))
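The visual impression from a biplot can be backed with two numbers: the share of variance captured by the first principal component, and whether all first-component loadings share a sign. The sketch below illustrates this on synthetic data only. The `calm` and `crash` matrices and the helpers `pc1.share` and `same.sign` are hypothetical names introduced here, and the simulated series are not the Dow Jones prices.

```r
set.seed(42)
n <- 120  # trading days
p <- 6    # stocks

# "Calm" regime: each stock follows its own independent random walk.
calm <- apply(matrix(rnorm(n * p), n, p), 2, cumsum) + 100

# "Crash" regime: one shared downward factor dominates every stock.
shared <- cumsum(rnorm(n, mean = -0.5))
crash <- 100 + outer(shared, rep(1, p)) + matrix(rnorm(n * p, sd = 0.5), n, p)

pca.calm  <- prcomp(calm,  center = TRUE, scale. = TRUE)
pca.crash <- prcomp(crash, center = TRUE, scale. = TRUE)

# Share of total variance captured by the first principal component.
pc1.share <- function(pca) pca$sdev[1]^2 / sum(pca$sdev^2)
pc1.share(pca.calm)
pc1.share(pca.crash)

# TRUE when every PC1 loading has the same sign, i.e. all biplot
# arrows point the same way along the first component.
same.sign <- function(pca) length(unique(sign(pca$rotation[, 1]))) == 1
same.sign(pca.crash)
```

Applied to `pca_2019` and `pca_2020`, the same two helpers would give a rough numerical check on what the biplots suggest.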